This simulation study compares frequentist and Bayesian approaches to treatment-control comparisons in platform trials that use non-concurrent controls (NCC). We restrict attention to trials with continuous endpoints.
We consider the following approaches:
Frequentist regression model that takes into account all data until the arm under study left the trial and adjusts for periods using a stepwise function. Note: this model extends the one presented by Bofill Roig et al. to trials with more than two arms.
The Bayesian Time Machine (Saville et al.), which uses a second-order Bayesian normal dynamic linear model (NDLM), takes into account all data until the investigated arm left the trial and includes covariate adjustment for time (separating the trial into buckets of pre-defined size) using a hierarchical model that smooths the control response rate over time.
The MAP prior approach (Schmidli et al.), where non-concurrent control data is used to obtain the MAP prior distribution for the control response in the concurrent period.
For comparison, we also analyse the data using the separate approach, which uses concurrent control (CC) data only, as well as naive pooling of CC and NCC data.
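To make the difference between the two reference analyses concrete, the following minimal Python sketch contrasts them on toy data. The function name `effect_estimate`, the period labels, and all numbers are illustrative assumptions and are not taken from the simulation code. The sketch only compares differences in means and omits the hypothesis test.

```python
from statistics import mean

def effect_estimate(treat_y, control_y, control_period, concurrent_periods, pool_ncc=False):
    """Difference in means between one treatment arm and the control arm.

    pool_ncc=False: separate analysis, concurrent controls (CC) only.
    pool_ncc=True:  naive pooling of CC and NCC data.
    """
    if pool_ncc:
        controls = list(control_y)
    else:
        controls = [y for y, p in zip(control_y, control_period)
                    if p in concurrent_periods]
    return mean(treat_y) - mean(controls)

# Hypothetical toy data: the control response drifts upwards by 0.2
# between period 1 and period 2.
control_y = [0.0, 0.0, 0.2, 0.2]
control_period = [1, 1, 2, 2]
treat_y = [0.45, 0.45]  # arm active in period 2 only, true effect 0.25

separate = effect_estimate(treat_y, control_y, control_period, {2})
pooled = effect_estimate(treat_y, control_y, control_period, {2}, pool_ncc=True)
```

In this toy example the separate estimate is approximately 0.25, while naive pooling gives approximately 0.35, because the earlier (non-concurrent) controls had lower responses.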
We simulated platform trials evaluating the efficacy of \(K\) treatment arms compared to a shared control. Arm \(k\) enters after \(d_k\) patients have been recruited to the trial, with \(d_1=0\). Patients are allocated to the active treatment arms and control following a 1:1:…:1 allocation. The duration of the trial is split into \(S\) periods, which are defined as time intervals bounded by any treatment either entering or leaving the trial.
The continuous response \(y_j\) for patient \(j\) was generated according to:
\[E(y_j) = \eta_0 + \sum_{k=1}^K \theta_k \cdot I(k_j=k) + f(j)\] where \(\eta_0\) is the response in the control arm, \(\theta_k\) is the effect of treatment \(k\), and \(k_j\) is the arm to which patient \(j\) is allocated.
The function \(f(j)\) denotes the time trend, whose strength is indicated by \(\lambda_{k_j}\) and which can have three patterns:
Linear time trend: \(f(j) = \lambda_{k_j} \cdot \frac{j-1}{N-1}\), where \(N\) is the total sample size in the trial
Stepwise time trend: \(f(j) = \lambda_{k_j} \cdot (c_j - 1)\), where \(c_j\) is the number of treatment arms that have already entered the trial when patient \(j\) is enrolled
Inverted-U time trend: \[f(j) = \begin{cases} \lambda \cdot \frac{j-1}{N-1} & \text{for } j \leq N_p \\ -\lambda \cdot \frac{j-N_p}{N-1} + \lambda \cdot \frac{N_p-1}{N-1} & \text{for } j > N_p \end{cases}\]
where \(N_p\) indicates the point at which the trend switches direction
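The three time trend patterns can be written as a single function. The following Python sketch is an illustration of the formulas above, not the code used in the simulations. The function name `time_trend` and its argument names are assumptions, and `lam` stands for the trend strength of the arm that patient \(j\) belongs to.

```python
def time_trend(j, n_total, lam, pattern, c_j=None, n_p=None):
    """Time trend f(j) for the j-th enrolled patient (j is 1-based).

    n_total: total sample size N of the trial
    lam:     strength of the time trend in the patient's arm
    c_j:     number of arms that have entered so far (stepwise only)
    n_p:     patient index at which the trend switches direction (inverted-U only)
    """
    if pattern == "linear":
        return lam * (j - 1) / (n_total - 1)
    if pattern == "stepwise":
        if c_j is None:
            raise ValueError("stepwise trend requires c_j")
        return lam * (c_j - 1)
    if pattern == "inverted_u":
        if n_p is None:
            raise ValueError("inverted-U trend requires n_p")
        if j <= n_p:
            return lam * (j - 1) / (n_total - 1)
        # After the switch point the trend decreases with the same slope.
        return -lam * (j - n_p) / (n_total - 1) + lam * (n_p - 1) / (n_total - 1)
    raise ValueError(f"unknown pattern: {pattern}")


def expected_response(j, n_total, eta_0, theta, lam, pattern, **kwargs):
    """E(y_j) = eta_0 + theta + f(j); use theta = 0 for control patients."""
    return eta_0 + theta + time_trend(j, n_total, lam, pattern, **kwargs)
```

Under the linear pattern, the first patient has no trend contribution and the last patient has a contribution equal to `lam`.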
We consider three designs of platform trials with \(K\) treatment arms that enter the trial in a staggered way. Each design corresponds to an objective:
In Design I, we explore the effect of the overlap between arms in trials with \(K=3\) experimental arms, in settings with equal time trends and different time trends.
In Design II, we investigate the impact of different entry times in trials with \(K=4\) arms. For this, we vary the entry time of arm \(3\) and consider equal time trends only.
In Design III, we explore the operating characteristics of more realistic platform trials. We consider \(K=10\) arms, and random time trends.
The considered trial designs are illustrated below.
In all three designs, we assume equal sample sizes of 250 in all treatment arms and 1:1:…:1 allocation ratio in each period, and consider different scenarios varying the overlaps between arms indicated by \(\mathbf{d} = (d_1,...,d_K)\). Furthermore, we consider linear, stepwise and inverted-U time trends, which are either equal across all arms, or different in arm 1 or arms 1 and 2 (in this case, only arm 3 is evaluated). For the Bayesian Time Machine, we used bucket sizes of 25 in all designs. Moreover, we assumed effect sizes of \(\theta_{i} = 0.25, i=1,...,K\) for the treatment-control comparisons under the alternative hypothesis. The chosen sample and effect sizes lead to 80% power for the treatment-control comparison using a separate analysis (one-sided t-test at 2.5% significance level).
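The stated power can be checked with a short calculation. The sketch below uses a normal approximation to the one-sided two-sample t-test and assumes a residual standard deviation of 1, which is not stated explicitly in this section. The function name `approx_power` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(n_per_arm, effect, sd=1.0, alpha=0.025):
    """Approximate power of a one-sided two-sample test (normal approximation).

    Assumes equal group sizes and a common standard deviation `sd`
    (sd = 1 is an assumption made for this illustration).
    """
    std_normal = NormalDist()
    se = sd * sqrt(2.0 / n_per_arm)           # standard error of the mean difference
    z_crit = std_normal.inv_cdf(1.0 - alpha)  # one-sided critical value
    return std_normal.cdf(effect / se - z_crit)

power = approx_power(n_per_arm=250, effect=0.25)
```

With 250 patients per arm and an effect of 0.25, the approximation gives a power close to 0.80, in line with the value quoted above.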
In the remainder of this report, the following parameters were used for the Time Machine:
prec_theta: 0.001
prec_eta: 0.001
prec_a: 0.001
prec_b: 0.001
bucket_size: 25
For the parameters tau_a and tau_b, we consider the following options based on the expected and maximal jump in the control response between periods:
Assuming stepwise time trend with \(\lambda=0.15\) and \(d=250\)
| Assumption | Expected jump | Maximal jump | tau_a | tau_b |
|---|---|---|---|---|
| Reasonable jump | 1e-02 | 0.150 | 1.099121 | 0.0001099 |
| Small jump | 1e-03 | 0.015 | 109.912060 | 0.0001099 |
| Large jump | 1e+01 | 15.000 | 11.562213 | 1156.2213391 |
In plots comparing different methods to incorporate NCC, we use the assumption of a reasonable jump of the time trend. In plots showing the calibration of the Time Machine model, we present a comparison of the three assumed jump sizes (only for Design I).
In the remainder of this report, the following parameters were used for the MAPPrior function when comparing this approach to other methods:
opt: 2
n_samples: 1000
n_chains: 4
n_iter: 4000
n_adapt: 1000
robustify: TRUE
weight: 0.1
prior_prec_eta: 1
prior_prec_tau: 0.002
In plots showing the calibration of the MAP approach, we consider the following options for prior_prec_eta and prior_prec_tau:
prior_prec_eta \(\in \{0.001, 1\}\)
prior_prec_tau \(\in \{2, 0.2, 0.002\}\)
while the other parameters remain unchanged.
In the first design, we examine a platform trial with 3 treatment arms, where treatment arm \(i\) enters after \(d_i = d \cdot (i-1)\) patients have joined the trial. We consider 5 options, \(d \in \{0, 125, 250, 375, 500\}\), resulting in platform trials with different overlaps between arms, as illustrated below. We consider time trends that are equal across all arms, or that differ either in arm 1, or in arms 1 and 2. In cases with different time trends, only treatment arm 3 is evaluated.
| Trt 1 | Trt 2 | Trt 3 = Control |
|---|---|---|
| \(\lambda\) | \(\lambda\) | \(\lambda\) |
| \(\lambda\) | 0.1 | 0.1 |
| \(\lambda\) | \(\lambda\) | 0.1 |
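As a small worked example for Design I, the following Python sketch computes the entry times \(d_i = d \cdot (i-1)\) and the number of arms that have entered when a given patient is enrolled, which is the quantity \(c_j\) that drives the stepwise time trend. The function names are illustrative, and the sketch assumes that an arm entering after \(d_k\) patients is first available to patient \(d_k + 1\).

```python
def entry_times(d, n_arms=3):
    """Entry times d_i = d * (i - 1) for arms i = 1, ..., n_arms."""
    return [d * (i - 1) for i in range(1, n_arms + 1)]

def arms_entered(j, d_vec):
    """Number of arms that have entered when patient j (1-based) is enrolled.

    Assumption: an arm with entry time d_k is open from patient d_k + 1 onwards.
    """
    return sum(1 for d_k in d_vec if d_k < j)

# Entry times for the five overlap options of Design I
design_1 = {d: entry_times(d) for d in (0, 125, 250, 375, 500)}

# Stepwise trend contribution lambda * (c_j - 1) for patient 251 when d = 250
lam = 0.15
c_j = arms_entered(251, design_1[250])
stepwise_trend = lam * (c_j - 1)
```

For \(d = 0\) all three arms start together, so \(c_j = 3\) for every patient and the stepwise trend is constant over the trial.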
In the second design, we examine a platform trial with 4 treatment arms, where treatment arms 2 and 4 enter after \(d_2=300\) and \(d_4=800\) patients have been recruited to the trial, respectively. We consider 5 options for the entry time of the third treatment arm, \(d_3 \in \{300, 425, 550, 675, 800\}\), as illustrated below.
Scenario III-I consists of 10 treatment arms, where treatment arm \(i\) enters after \(300 \cdot (i-1)\) patients have been recruited to the trial. In this case, only treatment arm 10 is evaluated, and it has no time trend, just like the control group (\(\lambda_0 = \lambda_{10} = 0\)). The time trend in the remaining treatment arms is varied, with \(\lambda_1=\lambda_2=\ldots=\lambda_9\).
Scenario III-II consists of 10 treatment arms, where treatment arm \(i\) enters after \(300 \cdot (i-1)\) patients have been recruited to the trial. In this case, only treatment arm 10 is evaluated; its time trend is varied and equals that of the control group (\(\lambda_0 = \lambda_{10}\)). The time trend in the remaining treatment arms is sampled from \(\lambda_i \sim N(\lambda_0, 0.5), \forall i \in \{1,\ldots,9\}\).
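The setup of Scenario III-II can be sketched in a few lines of Python. This is an illustration only: the function name `scenario_3_2` and its arguments are hypothetical, and the text does not say whether the 0.5 in \(N(\lambda_0, 0.5)\) is a variance or a standard deviation, so the sketch exposes this choice as a flag and treats it as a variance by default.

```python
import random

def scenario_3_2(lambda_0, spread=0.5, spread_is_variance=True,
                 n_arms=10, gap=300, seed=1):
    """Entry times and time trend strengths for a Scenario III-II style trial.

    Assumption: `spread` is interpreted as a variance unless
    spread_is_variance=False, in which case it is used as a standard deviation.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    sd = spread ** 0.5 if spread_is_variance else spread
    entry = [gap * (i - 1) for i in range(1, n_arms + 1)]
    # Arms 1..9 get random trends centred at the control trend lambda_0
    lambdas = [rng.gauss(lambda_0, sd) for _ in range(n_arms - 1)]
    # Arm 10 (the evaluated arm) shares the control trend
    lambdas.append(lambda_0)
    return entry, lambdas

entry, lambdas = scenario_3_2(lambda_0=0.1)
```

The last arm enters after 2700 patients, and its trend equals the control trend by construction, so any bias in its evaluation comes from the differing trends in the earlier arms.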